In-Place Activated BatchNorm for Memory-Optimized Training of DNNs
In this work we present In-Place Activated Batch Normalization (InPlace-ABN),
a novel approach to drastically reduce the training memory footprint of
modern deep neural networks in a computationally efficient way. Our solution
substitutes the conventionally used succession of BatchNorm + Activation layers
with a single plugin layer, hence avoiding invasive framework surgery while
providing straightforward applicability for existing deep learning frameworks.
We obtain memory savings of up to 50% by dropping intermediate results and by
recovering required information during the backward pass through the inversion
of stored forward results, with only a minor increase (0.8-2%) in computation
time. Also, we demonstrate how frequently used checkpointing approaches can be
made computationally as efficient as InPlace-ABN. In our experiments on image
classification, we demonstrate on-par results on ImageNet-1k with
state-of-the-art approaches. On the memory-demanding task of semantic
segmentation, we report results for COCO-Stuff, Cityscapes and Mapillary
Vistas, obtaining new state-of-the-art results on the latter without additional
training data but in a single-scale and -model scenario. Code can be found at
https://github.com/mapillary/inplace_abn
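The inversion idea described in this abstract can be illustrated with a minimal sketch. It assumes an invertible activation (a leaky ReLU with slope 0.01, chosen here for illustration) and a nonzero scale `gamma`; the function names are hypothetical and this is not the code from the linked repository. Only the layer output is kept, and the normalized input is recomputed from it when needed.

```python
def leaky_relu(z, slope=0.01):
    # Invertible activation: strictly monotonic for slope > 0.
    return z if z >= 0 else slope * z


def leaky_relu_inverse(y, slope=0.01):
    # Exact inverse of leaky_relu for slope > 0.
    return y if y >= 0 else y / slope


def forward(x_hat, gamma, beta, slope=0.01):
    # x_hat is the already-normalized input; only the output y is stored.
    return [leaky_relu(gamma * v + beta, slope) for v in x_hat]


def recover_normalized_input(y, gamma, beta, slope=0.01):
    # Invert the activation, then the affine transform (assumes gamma != 0).
    return [(leaky_relu_inverse(v, slope) - beta) / gamma for v in y]


x_hat = [-1.5, -0.2, 0.0, 0.7, 2.0]
y = forward(x_hat, gamma=0.5, beta=0.1)
recovered = recover_normalized_input(y, gamma=0.5, beta=0.1)
max_error = max(abs(a - b) for a, b in zip(x_hat, recovered))
```

Because `recovered` matches `x_hat` up to floating-point error, the intermediate tensor does not need to be kept in memory for the backward pass in this toy setting.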
Best Sources Forward: Domain Generalization through Source-Specific Nets
A long-standing problem in visual object categorization is the ability of algorithms to generalize across different testing conditions. The problem has been formalized as a covariate shift among the probability distributions generating the training data (source) and the test data (target), and several domain adaptation methods have been proposed to address this issue. While these approaches have considered the single-source, single-target scenario, it is plausible to have multiple sources and to require adaptation to any possible target domain. This last scenario, named Domain Generalization (DG), is the focus of our work. Unlike previous DG methods, which learn domain-invariant representations from source data, we design a deep network with multiple domain-specific classifiers, each associated with a source domain. At test time we estimate the probabilities that a target sample belongs to each source domain and exploit them to optimally fuse the classifiers' predictions. To further improve the generalization ability of our model, we also introduce a domain-agnostic component supporting the final classifier. Experiments on two public benchmarks demonstrate the power of our approach.
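The fusion step mentioned in this abstract can be sketched as a weighted average of per-domain classifier outputs. This is a minimal illustration under the assumption that fusion is a plain convex combination; the abstract does not specify the exact rule, and `fuse_predictions` and the numbers below are hypothetical.

```python
def fuse_predictions(domain_probs, classifier_outputs):
    # domain_probs[d]: estimated probability that the sample belongs to source domain d.
    # classifier_outputs[d]: class probabilities from the classifier of domain d.
    num_classes = len(classifier_outputs[0])
    fused = [0.0] * num_classes
    for weight, class_probs in zip(domain_probs, classifier_outputs):
        for c, p in enumerate(class_probs):
            fused[c] += weight * p
    return fused


domain_probs = [0.7, 0.2, 0.1]          # three source domains
classifier_outputs = [
    [0.9, 0.1],                         # classifier of domain 0
    [0.4, 0.6],                         # classifier of domain 1
    [0.5, 0.5],                         # classifier of domain 2
]
fused = fuse_predictions(domain_probs, classifier_outputs)
predicted_class = max(range(len(fused)), key=lambda c: fused[c])
```

With weights that sum to one and per-classifier outputs that are probability vectors, the fused vector is again a probability vector.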
Robust Place Categorization With Deep Domain Generalization
Traditional place categorization approaches in robot vision assume that training and test images have similar visual appearance. Therefore, any seasonal, illumination, and environmental changes typically lead to severe degradation in performance. To cope with this problem, recent works have proposed adopting domain adaptation techniques. While effective, these methods assume that some prior information about the scenario where the robot will operate is available at training time. Unfortunately, in many cases, this assumption does not hold, as we often do not know where a robot will be deployed. To overcome this issue, in this paper, we present an approach that aims at learning classification models able to generalize to unseen scenarios. Specifically, we propose a novel deep learning framework for domain generalization. Our method develops from the intuition that, given a set of different classification models associated with known domains (e.g., corresponding to multiple environments or robots), the best model for a new sample in the novel domain can be computed directly at test time by optimally combining the known models. To implement our idea, we exploit recent advances in deep domain adaptation and design a convolutional neural network architecture with novel layers performing a weighted version of batch normalization. Our experiments, conducted on three common datasets for robot place categorization, confirm the validity of our contribution.
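A weighted version of batch normalization, as mentioned in this abstract, can be sketched by computing the batch mean and variance with per-sample weights. This is only one plausible reading: the abstract does not give the formula, so the weighting scheme, the `eps` value, and the function name are assumptions made for illustration.

```python
import math


def weighted_batch_norm(values, weights, eps=1e-5):
    # Per-sample weights (e.g. domain membership scores) decide how much
    # each sample contributes to the normalization statistics.
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    var = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    return [(v - mean) / math.sqrt(var + eps) for v in values]


values = [1.0, 2.0, 3.0, 4.0]
weights = [0.5, 0.5, 0.0, 0.0]   # only the first two samples shape the statistics
normalized = weighted_batch_norm(values, weights)
weighted_mean_after = sum(w * n for w, n in zip(weights, normalized)) / sum(weights)
```

With uniform weights the function reduces to ordinary batch normalization of a single feature without the learned scale and shift.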
Learning Deep NBNN Representations for Robust Place Categorization
This paper presents an approach for semantic place categorization using data
obtained from RGB cameras. Previous studies on visual place recognition and
classification have shown that, by considering features derived from
pre-trained Convolutional Neural Networks (CNNs) in combination with part-based
classification models, high recognition accuracy can be achieved, even in
presence of occlusions and severe viewpoint changes. Inspired by these works,
we propose to exploit local deep representations, representing images as sets of
regions and applying a Naïve Bayes Nearest Neighbor (NBNN) model for image
classification. As opposed to previous methods where CNNs are merely used as
feature extractors, our approach seamlessly integrates the NBNN model into a
fully-convolutional neural network. Experimental results show that the proposed
algorithm outperforms previous methods based on pre-trained CNN models and
that, when employed in challenging robot place recognition tasks, it is robust
to occlusions and to environmental and sensor changes.
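For readers unfamiliar with NBNN, the classic decision rule can be sketched as follows: each local descriptor of a query image is matched to its nearest neighbor in every class, and the class with the smallest summed squared distance wins. This is the standard non-deep NBNN rule on toy 2-D descriptors, not the fully-convolutional integration proposed in the paper; the class names and data are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nbnn_classify(descriptors, class_pools):
    # class_pools maps each class label to the local descriptors seen for it.
    best_label, best_cost = None, float("inf")
    for label, pool in class_pools.items():
        # Image-to-class distance: sum of nearest-neighbor distances.
        cost = sum(min(squared_distance(d, p) for p in pool) for d in descriptors)
        if cost < best_cost:
            best_label, best_cost = label, cost
    return best_label


class_pools = {
    "kitchen": [[0.0, 0.0], [0.0, 1.0]],
    "corridor": [[5.0, 5.0], [6.0, 5.0]],
}
query_descriptors = [[0.1, 0.2], [0.2, 0.9]]
predicted_place = nbnn_classify(query_descriptors, class_pools)
```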
AdaGraph: Unifying Predictive and Continuous Domain Adaptation through Graphs
The ability to categorize is a cornerstone of visual intelligence, and a key
functionality for artificial, autonomous visual machines. This problem will
never be solved without algorithms able to adapt and generalize across visual
domains. Within the context of domain adaptation and generalization, this paper
focuses on the predictive domain adaptation scenario, namely the case where no
target data are available and the system has to learn to generalize from
annotated source images plus unlabeled samples with associated metadata from
auxiliary domains. Our contribution is the first deep architecture that tackles
predictive domain adaptation, able to leverage the information brought by
the auxiliary domains through a graph. Moreover, we present a simple yet
effective strategy that allows us to take advantage of the incoming target data
at test time, in a continuous domain adaptation scenario. Experiments on three
benchmark databases support the value of our approach. Comment: CVPR 2019 (oral)
AutoDIAL: Automatic DomaIn Alignment Layers
Classifiers trained on given databases perform poorly when tested on data
acquired in different settings. This is explained in domain adaptation through
a shift among distributions of the source and target domains. Attempts to align
them have traditionally resulted in works reducing the domain shift by
introducing appropriate loss terms, measuring the discrepancies between source
and target distributions, in the objective function. Here we take a different
route, proposing to align the learned representations by embedding in any given
network specific Domain Alignment Layers, designed to match the source and
target feature distributions to a reference one. In contrast to previous works
which define a priori in which layers adaptation should be performed, our
method is able to automatically learn the degree of feature alignment required
at different levels of the deep network. Thorough experiments on different
public benchmarks, in the unsupervised setting, confirm the power of our
approach. Comment: arXiv admin note: substantial text overlap with arXiv:1702.06332;
added supplementary material
Boosting Deep Open World Recognition by Clustering
While convolutional neural networks have brought significant advances in
robot vision, their ability is often limited to closed world scenarios, where
the number of semantic concepts to be recognized is determined by the available
training set. Since it is practically impossible to capture all possible
semantic concepts present in the real world in a single training set, we need
to break the closed world assumption, equipping our robot with the capability
to act in an open world. To provide such ability, a robot vision system should
be able to (i) identify whether an instance does not belong to the set of known
categories (i.e. open set recognition), and (ii) extend its knowledge to learn
new classes over time (i.e. incremental learning). In this work, we show how we
can boost the performance of deep open world recognition algorithms by means of
a new loss formulation enforcing a global to local clustering of class-specific
features. In particular, a first loss term, i.e. global clustering, forces the
network to map samples closer to the class centroid they belong to while the
second one, local clustering, shapes the representation space in such a way
that samples of the same class get closer in the representation space while
pushing away neighbours belonging to other classes. Moreover, we propose a
strategy to learn class-specific rejection thresholds, instead of heuristically
estimating a single global threshold, as in previous works. Experiments on
RGB-D Object and Core50 datasets show the effectiveness of our approach. Comment: IROS/RAL 202
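The role of class-specific rejection thresholds can be illustrated with a nearest-centroid sketch: a sample is assigned to the closest class centroid unless its distance exceeds that class's own threshold, in which case it is rejected as unknown. The thresholds here are fixed by hand rather than learned, and the names and values are hypothetical; the paper's actual classifier and threshold learning are not described in the abstract.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_open_set(feature, centroids, thresholds):
    # Pick the nearest class centroid, then apply that class's own threshold.
    label = min(centroids, key=lambda name: euclidean(feature, centroids[name]))
    distance = euclidean(feature, centroids[label])
    return label if distance <= thresholds[label] else "unknown"


centroids = {"mug": [0.0, 0.0], "stapler": [4.0, 0.0]}
thresholds = {"mug": 1.0, "stapler": 2.0}   # hand-set, one per class

near_mug = classify_open_set([0.3, 0.4], centroids, thresholds)
between = classify_open_set([1.5, 0.0], centroids, thresholds)
near_stapler = classify_open_set([2.5, 0.0], centroids, thresholds)
```

The second and third queries are both 1.5 away from their nearest centroid, yet only the one nearest to "mug" is rejected, which is the behavior a single global threshold cannot express.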
Contribution of surface electromyography in tennis: proposal of a new normalization method for upper limb muscles: influence of velocity and fatigue on upper limb muscle activity in tennis
The main purpose of this thesis is the study of upper limb muscle activity through surface electromyography (EMG) during a dynamic activity. An initial study showed that seven out of nine muscles can be normalized from two maximum dynamic tasks, while the two other muscles require the traditional isometric method. This procedure helps to improve the reliability of upper limb EMG while reducing normalization time. A second study, on the relationship between EMG and stroke velocity in the tennis forehand drive, highlighted changes in the EMG amplitude and activation timing of some upper limb muscles in response to increasing ball velocity. A third study showed that fatigue generated by intense tennis exercise results in a decrease in the activation level of the pectoralis major and the forearm muscles during strokes, without any change in activation timing. This decrease in EMG activity could explain the performance degradation observed during this experiment. However, strategies for protecting the organism and/or managing the speed-accuracy trade-off should be considered and may require future studies.